本文报告了建立在线语言学习工具的进步,以通过使用对话系统作为对话实践伙伴为学习者提供对话体验。我们的系统可以随时适应用户的语言水平。我们还提供自动语法错误反馈,以帮助用户从错误中学习。根据我们的第一个采用者,我们的系统娱乐和有用。此外,我们将为学习技术社区提供有关语言学习和语法校正的大规模对话数据集。我们的下一步是通过使用强化学习算法使我们的系统更适应用户配置文件。
translated by 谷歌翻译
语言模型既展示了定量的改进,又展示了新的定性功能,随着规模的增加。尽管它们具有潜在的变革性影响,但这些新能力的特征却很差。为了为未来的研究提供信息,为破坏性的新模型能力做准备,并改善社会有害的效果,至关重要的是,我们必须了解目前和近乎未来的能力和语言模型的局限性。为了应对这一挑战,我们介绍了超越模仿游戏基准(Big Bench)。 Big Bench目前由204个任务组成,由132家机构的442位作者贡献。任务主题是多样的,从语言学,儿童发展,数学,常识性推理,生物学,物理学,社会偏见,软件开发等等。 Big-Bench专注于被认为超出当前语言模型的功能的任务。我们评估了OpenAI的GPT型号,Google内部密集变压器体系结构和大型基础上的开关稀疏变压器的行为,跨越了数百万到数十亿个参数。此外,一个人类专家评估者团队执行了所有任务,以提供强大的基准。研究结果包括:模型性能和校准都随规模改善,但绝对的术语(以及与评估者的性能相比);在模型类中的性能非常相似,尽管带有稀疏性。逐渐和预测的任务通常涉及大量知识或记忆成分,而在临界规模上表现出“突破性”行为的任务通常涉及多个步骤或组成部分或脆性指标;社交偏见通常会随着含糊不清的环境而随着规模而增加,但这可以通过提示来改善。
translated by 谷歌翻译
诸如说服力之类的复杂对话设置涉及交流态度或行为的变化,因此即使与主题没有直接相关,用户的观点也需要解决。在这项工作中,我们贡献了一个新颖的模块化对话系统框架,该框架将事实信息和社会内容无缝地整合到有说服力的对话中。我们的框架可以推广到任何混合社交和任务内容的对话任务。我们进行了一项研究,将用户对框架的评估与基线端到端生成模型进行了比较。我们发现,与没有明确处理社交内容或事实问题的端到端模型相比,我们的框架在包括能力和友善的各个方面更受欢迎。
translated by 谷歌翻译
We consider the problem of estimating a multivariate function $f_0$ of bounded variation (BV), from noisy observations $y_i = f_0(x_i) + z_i$ made at random design points $x_i \in \mathbb{R}^d$, $i=1,\ldots,n$. We study an estimator that forms the Voronoi diagram of the design points, and then solves an optimization problem that regularizes according to a certain discrete notion of total variation (TV): the sum of weighted absolute differences of parameters $\theta_i,\theta_j$ (which estimate the function values $f_0(x_i),f_0(x_j)$) at all neighboring cells $i,j$ in the Voronoi diagram. This is seen to be equivalent to a variational optimization problem that regularizes according to the usual continuum (measure-theoretic) notion of TV, once we restrict the domain to functions that are piecewise constant over the Voronoi diagram. The regression estimator under consideration hence performs (shrunken) local averaging over adaptively formed unions of Voronoi cells, and we refer to it as the Voronoigram, following the ideas in Koenker (2005), and drawing inspiration from Tukey's regressogram (Tukey, 1961). Our contributions in this paper span both the conceptual and theoretical frontiers: we discuss some of the unique properties of the Voronoigram in comparison to TV-regularized estimators that use other graph-based discretizations; we derive the asymptotic limit of the Voronoi TV functional; and we prove that the Voronoigram is minimax rate optimal (up to log factors) for estimating BV functions that are essentially bounded.
translated by 谷歌翻译
In the field of cross-modal retrieval, single encoder models tend to perform better than dual encoder models, but they suffer from high latency and low throughput. In this paper, we present a dual encoder model called BagFormer that utilizes a cross modal interaction mechanism to improve recall performance without sacrificing latency and throughput. BagFormer achieves this through the use of bag-wise interactions, which allow for the transformation of text to a more appropriate granularity and the incorporation of entity knowledge into the model. Our experiments demonstrate that BagFormer is able to achieve results comparable to state-of-the-art single encoder models in cross-modal retrieval tasks, while also offering efficient training and inference with 20.72 times lower latency and 25.74 times higher throughput.
translated by 谷歌翻译
The past few years have witnessed the prevalence of self-supervised representation learning within the language and 2D vision communities. However, such advancements have not been fully migrated to the community of 3D point cloud learning. Different from previous pre-training pipelines for 3D point clouds that generally fall into the scope of either generative modeling or contrastive learning, in this paper, we investigate a translative pre-training paradigm, namely PointVST, driven by a novel self-supervised pretext task of cross-modal translation from an input 3D object point cloud to its diverse forms of 2D rendered images (e.g., silhouette, depth, contour). Specifically, we begin with deducing view-conditioned point-wise embeddings via the insertion of the viewpoint indicator, and then adaptively aggregate a view-specific global codeword, which is further fed into the subsequent 2D convolutional translation heads for image generation. We conduct extensive experiments on common task scenarios of 3D shape analysis, where our PointVST shows consistent and prominent performance superiority over current state-of-the-art methods under diverse evaluation protocols. Our code will be made publicly available.
translated by 谷歌翻译
In this work, we introduce a hypergraph representation learning framework called Hypergraph Neural Networks (HNN) that jointly learns hyperedge embeddings along with a set of hyperedge-dependent embeddings for each node in the hypergraph. HNN derives multiple embeddings per node in the hypergraph where each embedding for a node is dependent on a specific hyperedge of that node. Notably, HNN is accurate, data-efficient, flexible with many interchangeable components, and useful for a wide range of hypergraph learning tasks. We evaluate the effectiveness of the HNN framework for hyperedge prediction and hypergraph node classification. We find that HNN achieves an overall mean gain of 7.72% and 11.37% across all baseline models and graphs for hyperedge prediction and hypergraph node classification, respectively.
translated by 谷歌翻译
Graph Neural Networks (GNNs) have become increasingly important in recent years due to their state-of-the-art performance on many important downstream applications. Existing GNNs have mostly focused on learning a single node representation, despite that a node often exhibits polysemous behavior in different contexts. In this work, we develop a persona-based graph neural network framework called PersonaSAGE that learns multiple persona-based embeddings for each node in the graph. Such disentangled representations are more interpretable and useful than a single embedding. Furthermore, PersonaSAGE learns the appropriate set of persona embeddings for each node in the graph, and every node can have a different number of assigned persona embeddings. The framework is flexible enough and the general design helps in the wide applicability of the learned embeddings to suit the domain. We utilize publicly available benchmark datasets to evaluate our approach and against a variety of baselines. The experiments demonstrate the effectiveness of PersonaSAGE for a variety of important tasks including link prediction where we achieve an average gain of 15% while remaining competitive for node classification. Finally, we also demonstrate the utility of PersonaSAGE with a case study for personalized recommendation of different entity types in a data management platform.
translated by 谷歌翻译
Traditionally, data analysis and theory have been viewed as separate disciplines, each feeding into fundamentally different types of models. Modern deep learning technology is beginning to unify these two disciplines and will produce a new class of predictively powerful space weather models that combine the physical insights gained by data and theory. We call on NASA to invest in the research and infrastructure necessary for the heliophysics' community to take advantage of these advances.
translated by 谷歌翻译
This paper utilizes an anomaly detection algorithm to check if underwater gliders are operating normally in the unknown ocean environment. Glider pilots can be warned of the detected glider anomaly in real time, thus taking over the glider appropriately and avoiding further damage to the glider. The adopted algorithm is validated by two valuable sets of data in real glider deployments, the University of South Florida (USF) glider Stella and the Skidaway Institute of Oceanography (SkIO) glider Angus.
translated by 谷歌翻译